2,112 research outputs found

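    Design and implementation on reconfigurable hardware of an architecture for the real-time emulation of cellular neural networks (Diseño e implementación sobre hardware reconfigurable de una arquitectura para la emulación en tiempo real de redes neuronales celulares)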

    [SPA, translated] This thesis proposes the design and implementation on reconfigurable hardware of an architecture for the real-time emulation of cellular neural networks (CNNs). The design process begins by considering several methods for discretizing the original continuous model of the CNN. These methods yield different approximations, which are simulated and compared with one another to verify their functionality and to determine which gives the best results at the lowest computational cost. The best-performing approximation is chosen to develop the computation algorithm that describes the CNN hardware architecture. The development methodology explores different alternatives for optimizing the CNN architecture with a view to its hardware implementation on FPGAs. Parallelizing and adapting the computation algorithm leads to two different hardware architectures, called Carthago and Carthagonova. These architectures describe the operation of a CNN Cell, unrolled into Stages, which sequentially emulates the processing performed by CNNs. Their main feature is the ability to process information as a data stream and in real time. The main goal of the proposed solutions is to achieve the best balance between processing speed and FPGA hardware resource usage, and to avoid external memory devices, which reduce the processing speed of the system and increase its size. Several alternatives are proposed for implementing the architectures on FPGA devices. 
One of them uses an area-time-efficient self-timed synchronization technique, defined in a traditional hardware description language (VHDL) by instantiating low-level primitives and placing the components manually. Another alternative is a structural VHDL description at RTL level with conventional synchronization, in which the self-timed components are replaced by standard components. An implementation of one of the architectures on a high-performance reconfigurable computer (HPRC) is also proposed; the HPRC consists of a general-purpose microprocessor and an FPGA-based coprocessor that accelerates the execution of the algorithms in hardware. Hardware/software partitioning and co-design are carried out with the electronic system level (ESL) development tools from Impulse Accelerated Technologies (Impulse-C) and the DS1002 HPRC platform from DRC Computers. The main results obtained from the different implementations are presented to demonstrate the functionality of the architectures and to analyse their main performance figures. The combinations considered, of implementation techniques and proposed architectures, show that the Carthagonova architecture implemented at the structural level offers significant advantages. First, the architecture facilitates the emulation of complex CNNs, made up of hundreds of thousands or millions of neurons, on FPGA-based embedded systems. Second, the excellent trade-off achieved between processing speed and hardware resource usage makes it an interesting alternative to other solutions in the literature. 
Finally, the versatility and performance of the architecture support the development of real-time video processing systems and the design of applications based on bio-inspired neural models. The proposed CNN architecture is used to develop an artificial model of the first synapse of the retina, incorporating some of the main characteristics of the neural circuits considered. The model is based on the receptive fields of bipolar cells and aims to emulate, on reconfigurable hardware, the basic spatial processing performed by the retina. As in the first synapse of the retina, the proposed artificial model is observed to perform contrast detection and visual discrimination of detail depending on the influence of the convergence and lateral-inhibition factors of the implemented neural circuits. Finally, the design and implementation of a distributed computing system based on multiple FPGAs is proposed, which enables embedded real-time video processing applications with large, complex multi-layer CNNs (ML-CNNs). The system processes video as a data stream (in progressive mode) and provides a standard video output compatible with the industrial VGA format. [ENG] This thesis proposes the design and development of a hardware architecture for real-time emulation of non-linear multilayer Cellular Neural Networks (CNNs). The approach focuses on CNN implementation on reconfigurable hardware architectures. The architecture design starts from a discrete model obtained by applying different transformations to the original continuous CNN model. Each discrete approach is simulated and compared with the others in order to verify its functionality and to find the approach that best emulates the continuous model at minimum computational cost. 
The best discrete model found is then used to develop the hardware architecture of the CNN. The development methodology explores different alternatives to optimize the architecture from the point of view of its hardware implementation on FPGAs. The architectures Carthago and Carthagonova are developed from the hardware adaptation and parallelization of the sequential algorithm that describes the functionality of the selected CNN discrete model. These architectures are based on an unrolled cell, which is employed to emulate CNNs with a large number of neurons. The key characteristic of these architectures is their capability to process information in real time in a sequential manner. The proposed solution aims to find a suitable trade-off between area and speed, reducing the use of hardware resources on the FPGA and avoiding external memory devices, which lower the processing rate and increase size and cost. We propose different solutions for the internal implementation of both architectures on an FPGA. The first is a novel, area-time-efficient self-timed architecture. It is described using traditional Hardware Description Languages (HDL), through instantiation of low-level hardware primitives and manual placement. The second consists of a high-level description in structural VHDL using conventional synchronization instead of self-timed blocks. We also propose an implementation architecture that makes use of a High Performance Reconfigurable Computer (HPRC), combining general-purpose microprocessors with custom FPGA-based hardware accelerators, to reduce execution time. The hardware/software partitioning and co-design process are carried out using high-level design tools. The Carthago architecture is implemented using Electronic System Level (ESL) tools from Impulse Accelerated Technologies and the DS1002 HPRC platform from DRC Computers. 
The most relevant results obtained from the different implementations are shown in order to verify the functionality of the proposed neural hardware architectures and to analyze their performance. The best combination of architecture and implementation model, the Carthagonova-structural, presents some important advantages. Firstly, the architecture is developed to help emulate high-performance discrete CNNs with hundreds or millions of neurons on embedded FPGA-based systems. Secondly, due to its balanced trade-off between speed and area, this architecture is an interesting alternative to others in the literature. Finally, its versatility facilitates the export of the neural hardware architecture to applications such as signal and image processing, implementation of neurological models inspired by human systems, etc. The proposed hardware architecture is used to build a CNN-based model of the first synapse of the retina, which incorporates the main neural circuits found in the different retinal regions. The aim of this bioinspired model is to implement the basic spatial processing of the retina in reconfigurable hardware. The model is based on the receptive fields of bipolar cells and mimics the retinal architecture to achieve its processing capabilities. As occurs in the first synapse of the retina, contrast detection and detail resolution are observed to be influenced by the convergence factor of the neurons and by lateral inhibition, which are specific parameters of each neural circuit. We also propose the design and implementation of an embedded system based on multiple FPGAs that can be used to process real-time video streams for applications that require large Multi-Layer CNNs (ML-CNNs). The system processes video in progressive mode and provides a standard VGA output format. 
The main features of the system are determined by its distributed computing architecture, based on Processing Modules (PMs), which facilitates system expansion and adaptation to new applications. Several FPGA-based processing modules can be cascaded together with a video acquisition stage and an output interface to a frame grabber for video output storage, all sharing a common communication interface. Each PM is composed of an FPGA board that can hold one or more CNN layers. The total computing capacity of the system is determined by the number of PMs used and the amount of resources available in the FPGAs. The pre-verified CNN components, the modular architecture, and the expandable hardware platform provide an excellent workbench for fast and reliable development of CNN applications based on traditional cloned templates, but also on time-variant and space-variant templates. Universidad Politécnica de Cartagen
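
The abstract above does not say which discretization was finally selected, so the following is only a minimal sketch, assuming a forward-Euler discretization of the standard Chua-Yang CNN state equation with 3x3 cloned templates and zero-padded borders. Each call to `cnn_stage` plays the role of one unrolled stage; every function and parameter name here is hypothetical and not taken from the thesis.

```python
def saturate(x):
    """Piecewise-linear CNN output function: clamp the state to [-1, 1]."""
    return max(-1.0, min(1.0, x))


def conv3x3(img, template):
    """Weighted sum over each cell's 3x3 neighbourhood (zero padding at borders)."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += template[di + 1][dj + 1] * img[r][c]
            out[i][j] = acc
    return out


def cnn_stage(x, feedforward, A, h):
    """One stage: x <- x + h * (-x + A*y + feedforward), with y = saturate(x).

    Forward Euler is an assumption; the thesis compares several discretizations.
    """
    y = [[saturate(v) for v in row] for row in x]
    ay = conv3x3(y, A)
    return [
        [x[i][j] + h * (-x[i][j] + ay[i][j] + feedforward[i][j])
         for j in range(len(x[0]))]
        for i in range(len(x))
    ]


def emulate_cnn(u, x0, A, B, bias, h=0.1, stages=50):
    """Chain `stages` identical stages, mimicking a cell unrolled in stages."""
    # The input contribution B*u + bias does not change between stages.
    bu = conv3x3(u, B)
    feedforward = [[v + bias for v in row] for row in bu]
    x = [row[:] for row in x0]
    for _ in range(stages):
        x = cnn_stage(x, feedforward, A, h)
    return [[saturate(v) for v in row] for row in x]
```

In a software sketch the stages run one after another on the whole image; a hardware pipeline of such stages could instead process pixels as they stream in, which is the property the abstract emphasizes.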

    Discrete-time cellular neural networks in FPGA

    This paper describes a novel architecture for the hardware implementation of non-linear multi-layer cellular neural networks. The architecture makes it feasible to design CNNs with millions of neurons that fit in low-cost FPGA devices and can process standard video in real time. This research has been funded by MTyAS of Spain, IMSERSO RETVIS 150/06

    Hand-based interface for augmented reality

    Augmented reality (AR) is a highly interdisciplinary field which has received increasing attention since the late 1990s. Basically, it consists of a combination of the real scene viewed by a user and a computer-generated image, running in real time. AR thus allows the user to see the real world supplemented, in general, with information considered useful, enhancing the user's perception and knowledge of the environment. The benefits of reconfigurable hardware for AR have been explored by Luk et al. [4]. However, the vast majority of AR systems have so far been based on PCs or workstation

    HANNA: a tool for hardware prototyping and benchmarking of ANNs. Poster

    For some applications, designers must implement an ANN model on different platforms to meet performance, cost or power constraints, a process that becomes even more laborious when several hardware implementations have to be evaluated. Continuous advances in VLSI technologies, computer architecture and software development make it difficult to find an adequate implementation platform. HANNA (Hardware ANN Architect) is a tool designed to automate the generation of hardware prototypes of MLP-like neural networks on FPGA devices. Coupled with traditional Matlab/Simulink environments, the model can be synthesized, downloaded to the FPGA and co-simulated with the software version to trade off area, speed and precision requirements. This research is being funded by Ministerio de Ciencia y Tecnología TIC 2003-09557-C02-02

    A multi-FPGA distributed embedded system for the emulation of Multi-Layer CNNs in real time video applications

    This paper describes the design and implementation of an embedded system based on multiple FPGAs that can be used to process real-time video streams in standalone mode for applications that require large Multi-Layer CNNs (ML-CNNs). The system processes video in progressive mode and provides a standard VGA output format. The main features of the system are determined by its distributed computing architecture, based on Independent Hardware Modules (IHMs), which facilitate system expansion and adaptation to new applications. Each IHM is composed of an FPGA board that can hold one or more CNN layers. The total computing capacity of the system is determined by the number of IHMs used and the amount of resources available in the FPGAs. Our architecture supports traditional cloned templates, but also the (simultaneous) use of time-variant and space-variant templates. This work has been partially supported by the Fundación Séneca de la Región de Murcia through the research projects 08801/PI/08 and 08788/PI/08, and by the Spanish Government through project TIN2008-06893-C03

    Incorporating seismic activity not reflected in the catalogue into non-zoned methods (Incorporación de actividad sísmica no reflejada en el catálogo en métodos no zonificados)

    Seismic hazard assessments usually rely on catalogue information. The return period of probabilistic calculations must be consistent with the occurrence rates represented in the characterization of the activity. In the Gutenberg-Richter (GR) methodology, a maximum magnitude of integration is set that bounds the validity of the GR law, and the range covered by the GR law frequently exceeds that covered by the catalogue supporting it. In a non-zoned method the activity rate is built from the catalogue events and is representative only of those events, although incorporating uncertainties and handling the effective periods properly can extend the return periods beyond the range covered by the catalogue. A strategy is presented here for incorporating paleoseismic information into a methodology based on kernel density estimators

    Development of teaching equipment for laboratory sessions in control and identification courses (Desarrollo de un equipo didáctico para prácticas en asignaturas de control e identificación)

    This article presents the details of teaching equipment developed by the authors to specifications given by the lecturers of the DISA department at UPCT. The aim was to create equipment that supports the widest possible range of laboratory exercises in the department's systems regulation and control courses. Particular effort was also made to obtain equipment that is easy to operate, robust, reliable, safe, versatile and open in architecture, characteristics that in our view good teaching equipment must have. The article describes the construction details and the equipment's possibilities for laboratory work in automatic control courses. The authors wish to thank the Departamento de Ingeniería de Sistemas y Automática of UPCT for its support and its commitment to an initiative of this kind

    Tool for the hardware implementation of controllers on FPGAs (Herramienta para la implantación hardware de controladores sobre FPGAs)

    Barcelona, 12-14 September 2001. We wish to thank the department of electronics, computer technology and projects for its technical support in carrying out this project. We also wish to thank CEDETEL for its financial support

    A library-based tool to translate high level DNN models into hierarchical VHDL descriptions

    This work presents a tool to convert high-level models of deep neural networks into register transfer level designs. To make it useful for different target technologies, the output designs are based on hierarchical VHDL descriptions, which are accepted as input files by a wide variety of FPGA, SoC and ASIC digital synthesis tools. The tool aims to speed up the design and synthesis cycle of such systems and gives the designer some ability to balance network latency against hardware resources. It also provides a clock domain crossing to interface the input layer of the synthesized neural networks with sensors running at different clock frequencies. The tool is tested with a neural network that combines convolutional and fully connected layers, designed to perform traffic sign recognition tasks and synthesized under different hardware resource usage specifications on a Zynq Ultrascale+ MPSoC development board. This work has been partially funded by Spanish Ministerio de Ciencia e Innovación (MCI), Agencia Estatal de Investigación (AEI) and European Region Development Fund (ERDF/FEDER) under grant RTI2018-097088-B-C33

    Seismic hazards of the Iberian Peninsula - evaluation with Kernel functions

    The seismic hazard of the Iberian Peninsula is analysed using a nonparametric methodology based on statistical kernel functions; the activity rate is derived from the catalogue data, both its spatial dependence (without a seismogenetic zonation) and its magnitude dependence (without using Gutenberg–Richter's law). The catalogue is that of the Instituto Geográfico Nacional, supplemented with other catalogues around the periphery; the quantification of events has been homogenised and spatially or temporally interrelated events have been suppressed to assume a Poisson process. The activity rate is determined by the kernel function, the bandwidth and the effective periods. The resulting rate is compared with that produced using Gutenberg–Richter statistics and a zoned approach. Three attenuation laws have been employed, one for deep sources and two for shallower events, depending on whether their magnitude was above or below 5. The results are presented as seismic hazard maps for different spectral frequencies and for return periods of 475 and 2475 yr, which allows constructing uniform hazard spectra
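
As a rough illustration of how a kernel function, a bandwidth and effective periods can combine into an activity rate, here is a minimal sketch. It assumes an isotropic 2-D Gaussian kernel with a fixed bandwidth and planar coordinates; the paper's actual kernel, its bandwidth choice and any magnitude dependence are not given in the abstract and are not modelled here, and all names are hypothetical.

```python
import math


def gaussian_kernel_2d(dx, dy, bandwidth):
    """Isotropic 2-D Gaussian kernel that integrates to 1 over the plane."""
    r2 = dx * dx + dy * dy
    norm = 2.0 * math.pi * bandwidth ** 2
    return math.exp(-r2 / (2.0 * bandwidth ** 2)) / norm


def activity_rate(x, y, events, bandwidth):
    """Annual activity rate density at (x, y).

    `events` is a list of (event_x, event_y, effective_period_years) tuples.
    Each catalogue event contributes its kernel value divided by the
    effective period over which events of that kind are assumed complete.
    """
    return sum(
        gaussian_kernel_2d(x - ex, y - ey, bandwidth) / period
        for ex, ey, period in events
    )


# Hypothetical catalogue: two events with different effective periods.
catalogue = [(0.0, 0.0, 100.0), (3.0, 4.0, 500.0)]
rate_at_origin = activity_rate(0.0, 0.0, catalogue, bandwidth=1.0)
```

Because each kernel integrates to one, integrating this rate density over the whole plane gives the sum of the reciprocal effective periods, that is, the total number of events per year implied by the catalogue.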